Abstract

Social activities display bursty behavior characterized by heavy-tailed inter-event time
distributions. We examine the bursty behavior of airplanes’ arrivals in hub airports. The
analysis indicates that the air transportation system universally follows a power-law
inter-arrival time distribution with an exponent $\alpha = 2.5$ and an exponential cutoff.
Moreover, we investigate the mechanism of this bursty behavior by introducing a simple model
to describe it. In addition, we compare the extent of the hub-and-spoke structure and the
burstiness of various airline networks in the system. Remarkably, the results suggest that the
hub-and-spoke network of the system and the carriers’ strategy to facilitate transit are the
origins of this universality.

I. INTRODUCTION




Considerable attention has been paid to the dynamics of social activities and networks.
The availability of a large amount of data has recently enabled researchers to analyze the
detailed structures of social activity patterns. Burstiness has recently been considered a
fundamental pattern of social phenomena: the probability density functions (PDFs) of
inter-event times (IETs) of many social activities are characterized by heavy tails. This is
evidence of the non-Poissonian nature of social activities, which indicates that each social
activity is strongly correlated with other activities. The heavy-tailed structure of the IET
distribution is well approximated by a power-law tail with an exponential cutoff,
$p(\tau) \sim \tau^{-\alpha} e^{-\tau/\tau_0}$, where $\alpha$ is the exponent of the power
law and $\tau_0$ is the cutoff value [7]. This structure is universally observed in various
phenomena, including human activities, such as sending e-mails and library loans [8-12],
and natural phenomena, such as a neuron’s interspike intervals and an earthquake’s shock
intervals. Furthermore, the bursty behaviors of systems have a strong influence on the
collective phenomena in their networks [15-18]. The effect of bursts on spreading processes
on networks has recently been studied empirically, numerically, and analytically [19-22].
These studies indicate that burstiness is a significant factor in understanding social
phenomena on networks.

Moreover, proposing reasonable explanations and models for these activities has been a
significant issue [23]. One such model describing bursty behavior is a queueing model in
which an individual prioritizes some important tasks, based on the assumption that humans
have a wide range of tasks and attempt to deal with the urgent ones immediately [24-26].
Another possible explanation is a cascading Poisson process with a circadian rhythm. Once
one engages in an action such as sending an e-mail, one continuously repeats the action for
a while, although the initiation of each series of actions is independent of other actions;
this is called a cascading non-homogeneous Poisson process. Malmgren et al. proposed a model
in which agents start a cascade of actions at a rate determined by their circadian rhythms.
Jo et al. claimed, based on empirical data analysis, that human mobile phone communication
has bursts without a circadian rhythm. Some argue that bursts stem from the memory
effect [31]. However, these possible explanations of bursty behaviors in social phenomena
are difficult to apply to components of social systems such as the air transportation system.
The air transportation system has attracted considerable attention because of its
importance to mobility. This system consists of hubs and spokes and has small-world and
scale-free characteristics [32-34]. The assortativity, multiplexity, and epidemic spreading
in the network have all been themes of recent studies [35-37]. In addition, constructing
resilient air transportation systems is of utmost importance to our society in terms of
reliability. The influence of the air transportation network structure on robustness against
perturbations has been analyzed [38, 39]. In addition, the bursty arrival of airplanes is a
cause of traffic congestion in the air transportation system [40, 41]. This is a significant
source of destabilization in the system. It is necessary to study the extent of burstiness to
assess its effect on the system. Nevertheless, the burstiness of the air transportation system
is not well understood. In particular, understanding the bursty behavior in hub airports
is of significance because airplanes’ arrivals are concentrated in hub airports owing to the
small-world characteristic of the network.


Therefore, in this paper, we first analyze the inter-arrival time probability distributions
of airplanes in U.S. hub airports. Arrivals of airplanes in each hub airport correspond to
events, and the inter-arrival time is called the inter-event time (IET) in this analysis. We
find that the IET distributions of airplanes in U.S. hub airports follow power-law tails with
an exponent $\alpha = 2.5$ and an exponential cutoff. The extent of burstiness is assessed by
the cutoff value of the power law. Next, we study the origin of this universal bursty behavior
using a simple model and identify that it originates from airlines’ strategy to facilitate
transit at hub airports. Moreover, we analyze the relation between each airline’s network and
the extent of burstiness of the airplanes’ arrival behavior in its hub airports. The result
indicates that the hub-and-spoke structure of the network is important in the bursty behavior
of the system.


The remainder of this paper is organized as follows. In Sec. II, the empirical data of
airplanes’ arrivals are studied. We reveal that the IET distributions in hub airports are
power-law distributions with an exponential cutoff. In Sec. III, we investigate the origin of
this universally observed characteristic by proposing a model describing airlines’ strategy to
facilitate transit. In Sec. IV, we discuss the relationship between airline networks and the
extent of burstiness in their hub airports. Section V provides the conclusion.

FIG. 1. (Color online) The CCDFs of the IET of airplanes in three hub airports. The maximum
and minimum probability for each IET, $\tau$, are plotted, since many IETs have the same value.
The gray line is the exponential distribution. The lines are the theoretical results for the
CCDF of the IET of the Sine models for the parameters shown in the figure. All three
distributions follow power laws with an exponent $\alpha = 2.5$ and an exponential cutoff. The
theoretical lines of the Sine models agree well with the empirical data.


II. UNIVERSAL BURSTINESS IN EMPIRICAL AIR TRAFFIC DATA 


In this section, we discuss the burstiness of airplanes’ arrivals using empirical data. We
analyze the IET distributions of airplanes in the 10 largest hub airports in the U.S. based on
passenger boardings in 2014 (see Appendix A for details on the dataset and data processing).
The arrivals of all airplanes are counted for each airport. In Fig. 1, the complementary
cumulative distribution functions (CCDFs) of the IETs in three major hub airports are shown.
In this paper, the horizontal axis is the IET divided by the average IET. According to the
figure, it is universally observed that the distributions follow power laws with an exponent
$\alpha = 2.5$ and an exponential cutoff as

$$P(\tau) \sim \tau^{-\alpha} \exp(-\tau/\tau_0), \qquad (1)$$

where $\tau_0$ denotes the cutoff value.

We assess the cutoff value $\tau_0$ as follows. Consider a theoretically tractable model
following an inhomogeneous Poisson process. Such a process is a system whose events occur at a
time-dependent rate $f(t)$, with each event occurring independently [42]. Let us define a model
whose event rate is given by

$$f(t) = N\left[a \sin(2\pi n t) + 1\right] \quad (0 < t < 1) \qquad (2)$$

as the Sine model, where $a \in \mathbb{R}$ and $n \in \mathbb{N}$ are parameters and $N$
represents the average number of total events in a trial. The system starts and ends at the
times $t = 0$ and $t = 1$, respectively. Hereafter, we discuss the Sine model taking the limit
as $N \to \infty$. In this case, the CCDF of the IET of the Sine model is approximately given by
a power-law distribution with an exponent $\alpha = 2.5$ and an exponential cutoff when $\tau$
is large, regardless of the parameters, except for $a = 0$ (constant event rate). The cutoff
value $\tau_0$ depends only on the parameter $a$ and is given by $\tau_0 = 1/(1 - a)$ (see
Appendix F for details). Using this result, the logarithm of the theoretically calculated CCDF
of the IET of the Sine model is fitted to the logarithm of the empirical data. The IET
distribution of the Sine model is uniquely determined when $a$ is given, since the result is
independent of $n$. Then, we obtain the fitted parameter $a$ and calculate $\tau_0$. In
addition, we introduce $a$ as another metric for the extent of the burstiness and call it the
burst strength parameter. The parameter $a$ is utilized to assess the extent of burstiness of
the IET distributions in terms of the amplitude of the event rate, with $1 - a$ representing
the minimum event rate. The larger the extent of the burstiness, the larger both the cutoff
value and the burst strength parameter are.
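As an illustration of how the Sine-model CCDF can be evaluated numerically, the sketch below uses a standard large-$N$ approximation for an inhomogeneous Poisson process, in which an event at time $t$ is followed by an exponentially distributed gap with rate $f(t)$. This approximation and the function names `sine_model_ccdf` and `cutoff_value` are assumptions made here for illustration; the derivation in Appendix F may proceed differently.

```python
import numpy as np

def sine_model_ccdf(tau, a, n=1, grid=20000):
    """Approximate CCDF of the normalized IET for g(t) = a*sin(2*pi*n*t) + 1.

    Assumption (not taken from the paper): in the large-N limit an event at
    time t is followed by an exponential gap with rate g(t), and events occur
    with density g(t) on 0 < t < 1.
    """
    t = (np.arange(grid) + 0.5) / grid           # midpoints of [0, 1]
    g = a * np.sin(2.0 * np.pi * n * t) + 1.0    # event rate divided by N
    tau = np.atleast_1d(np.asarray(tau, dtype=float))
    # F(tau) = integral over t of g(t) * exp(-g(t) * tau), midpoint rule
    return (g * np.exp(-np.outer(tau, g))).mean(axis=1)

def cutoff_value(a):
    """Cutoff tau_0 = 1 / (1 - a) as quoted in the text (requires a < 1)."""
    return 1.0 / (1.0 - a)

if __name__ == "__main__":
    taus = np.array([1.0, 5.0, 20.0])
    for a in (0.0, 0.9):
        print(a, sine_model_ccdf(taus, a))
    print(cutoff_value(0.9))
```

With $a = 0$ the function reduces to the exponential CCDF of a homogeneous Poisson process, which gives a simple sanity check on the approximation.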

The theoretically obtained CCDFs of the Sine models for the parameters fitted with the
empirical data are also shown in Fig. 1. These lines agree well with the empirical data,
including the cutoff area, which indicates that the assumption of an inhomogeneous Poisson
process and an event rate $f(t) = N[a \sin(2\pi n t) + 1]$ $(0 < t < 1,\ N \to \infty)$ is
valid for fitting the empirical data to the model. As fitting to the Sine models is an
appropriate method for assessing the extent of burstiness in the air transportation system, we
utilize this method throughout the paper.
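The fitting step can be sketched as a one-parameter search over $a$. The grid search, the helper names, and the large-$N$ expression used for the model CCDF are all assumptions made for this illustration; the paper states only that the logarithm of the model CCDF is fitted to the logarithm of the empirical data.

```python
import numpy as np

def model_ccdf(tau, a, grid=4000):
    """Sine-model CCDF under an assumed large-N approximation."""
    t = (np.arange(grid) + 0.5) / grid
    g = a * np.sin(2.0 * np.pi * t) + 1.0
    tau = np.atleast_1d(np.asarray(tau, dtype=float))
    return (g * np.exp(-np.outer(tau, g))).mean(axis=1)

def empirical_ccdf(iets):
    """Return sorted IETs divided by their mean and the fraction of IETs >= each."""
    tau = np.sort(np.asarray(iets, dtype=float))
    tau = tau / tau.mean()
    ccdf = 1.0 - np.arange(len(tau)) / len(tau)
    return tau, ccdf

def fit_burst_strength(tau, ccdf, candidates=None):
    """Grid search for the a that minimizes the squared error between log CCDFs."""
    if candidates is None:
        candidates = np.linspace(0.0, 0.99, 100)   # hypothetical search grid
    log_emp = np.log(ccdf)
    errors = [np.sum((np.log(model_ccdf(tau, a)) - log_emp) ** 2)
              for a in candidates]
    return float(candidates[int(np.argmin(errors))])

if __name__ == "__main__":
    tau = np.linspace(0.5, 10.0, 20)
    a_hat = fit_burst_strength(tau, model_ccdf(tau, 0.8))
    print(a_hat, 1.0 / (1.0 - a_hat))   # fitted a and the implied cutoff
```

A grid search is used here only because the problem has a single bounded parameter; any scalar optimizer would serve the same purpose.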

The burst strength parameter, $a$, and the cutoff value, $\tau_0$, of the power-law
distribution of the 10 hub airports are shown in Table I. Each airport’s IATA code, the state
in which it is located, its number of arrivals, its main carrier, and its main carrier’s share
are also shown. The table shows that the cutoff value is large for airports located across a
wide range of areas in the U.S., which indicates that the IET distributions of arrivals in
these airports are bursty. The extent of burstiness in these airports is universally large,
although the airports differ in terms of their main carriers, locations, and passengers’
destinations (each airport’s cutoff value is slightly different; see Appendix B for the
analysis of this difference in the cutoff value among hub airports). This indicates that a
mechanism generating this universal bursty behavior exists in the air transportation system.

Airport  State  $N_{\mathrm{total}}$  Main carrier  Share   $a$   $\tau_0$
ATL      GA     28226                 DL            69.25%  0.95  19.11
LAX      CA     17990                 UA            18.91%  0.83  5.95
ORD      IL     17840                 UA            26.81%  0.84  6.21
DFW      TX     22701                 AA            69.23%  0.90  10.00
JFK      NY     7059                  B6            37.74%  0.67  2.99
DEN      CO     17160                 WN            26.41%  0.85  6.59
SFO      CA     13054                 UA            39.13%  0.86  7.40
CLT      NC     9306                  US            59.25%  0.88  8.39
LAS      NV     10807                 WN            43.85%  0.86  7.11
PHX      AZ     13051                 US            26.41%  0.86  6.43

TABLE I. The burst strength parameter, $a$, and the cutoff value, $\tau_0$, of the power-law
distribution of 10 major airports. $N_{\mathrm{total}}$ denotes the number of total arrivals in
the dataset. Although the locations, numbers of arrivals, and main carriers of these airports
are different, most airports have large cutoff values, which indicates that the bursty
behaviors of airplanes’ arrivals in hub airports are universally observed.
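The relation $\tau_0 = 1/(1 - a)$ can be checked against individual rows of Table I by inverting it. The short sketch below does this for three rows copied from the table; the helper name `implied_burst_strength` is hypothetical, and the comparison is made after rounding to two decimals because the table reports $a$ at that precision.

```python
def implied_burst_strength(tau0):
    """Invert tau_0 = 1 / (1 - a) to recover a from a tabulated cutoff value."""
    return 1.0 - 1.0 / tau0

# (a, tau_0) pairs copied from Table I for three airports
rows = {"ATL": (0.95, 19.11), "DFW": (0.90, 10.00), "JFK": (0.67, 2.99)}

if __name__ == "__main__":
    for airport, (a, tau0) in rows.items():
        print(airport, a, round(implied_burst_strength(tau0), 2))
```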


III. ORIGIN OF BURSTINESS IN THE AIR TRANSPORTATION SYSTEM 

A. Description of transit facilitation 

In this section, we discuss the origin of the universally observed bursty behavior. The
mechanism behind the bursty behavior in the air transportation system is different from the
explanations of bursty behavior in other systems. First, we consider a factor that affects the
scheduled arrival time of each airplane to model airplanes’ arrival dynamics. Facilitation of
passengers’ transit at airports is the main factor affecting the flight schedule. Transit plays
an important role in the air transportation system. Pan et al. showed that temporal distances
in the air transportation network are shorter than those in the time-shuffled model, in which
the time stamps of all arrivals are shuffled [43]. The schedule of the system is optimized to
efficiently transport passengers. The facilitation of passengers’ transit is an airline’s
traffic optimization strategy, which shortens the temporal distance.

FIG. 2. (Color online) The 30-min moving averages of the numbers of arrivals and departures in
a day. Thirty minutes is the time interval during which most passengers make a plane connection.
Heterogeneity of the arrival and departure behaviors contributes to transit facilitation. Arrows
represent examples of transit plans making the most of the transit facilitation strategies of
airlines.


Let us discuss the necessity of the transit facilitation strategy of airlines. The air
transportation network is composed of hubs and spokes. This network enables passengers to
travel to a variety of destinations via hub airports because of its small-world
characteristic. Passengers arriving at these airports transfer from airplanes to various other
airplanes. These airports have a strong demand for facilitating the transit of passengers.
Thus, airlines try to present more destination choices to their passengers. Airlines must
prevent mishaps such as missed connections caused by a very short transit time, whereas few
passengers choose a connecting flight that departs long after the arrival of their previous
flight.


To satisfy these conditions, the following facilitation strategy is adopted: all possible
airplanes that transit passengers might board are arranged to arrive at the airport at almost
the same time. In addition, connecting flights also depart the airport at almost the same
time. Let us call these times arrival/departure-concentrated times. The time interval for
transit is long enough that passengers rarely miss a connection and short enough that
passengers will choose the flight as a connecting flight. These combinations cause passengers’
transit times to remain almost constant regardless of their sources and destinations, thereby
facilitating transit.

FIG. 3. (Color online) (a) Histogram of scheduled arrival airplanes and their 5-min moving
average are shown in blue (gray in gray-scale) and black, respectively. Solid and dashed
vertical lines are local minimums and maximums, respectively. The area is divided into regions
by the local minimums. As an approximation, airplanes’ arrivals are concentrated on the local
maximums in each region for constructing the Normal distribution model. (b) Arrival rate of the
Normal distribution model constructed using the empirical data of ATL.

We discuss the influence of the facilitation of passengers’ transit on the flight schedule.
Several flights are scheduled to arrive at the airport at the arrival-concentrated time, while
not many airplanes are expected at other times. This factor causes the number of arrivals at
an airport to fluctuate during a certain time span. The number of arrivals as a function of
time, $t$, has many local peaks. In Fig. 2, the 30-min moving averages of the numbers of
arrivals and departures on January 31, 2014 are shown. Since most passengers take 45-75 min to
make a plane connection [44], the 30-min moving average of the number of departure flights at
time $t$ indicates the number of flights with which passengers arriving at the airport 60 min
before time $t$ are able to make a connection. The figure shows that the numbers of arrivals
and departures alternately approach their local peaks with the passage of time. Arrows are
instances of transit plans that allow passengers to benefit from various options for
connecting flights. For instance, passengers who arrive at the airport at around 16:30 can
comfortably leave the airport at around 18:00 because of transit facilitation.
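A moving average of this kind can be computed from a list of event times as sketched below. The input format (minutes since midnight), the centered window, and the function name `moving_average_counts` are assumptions for illustration; the example arrival times are made up and are not taken from the dataset.

```python
import numpy as np

def moving_average_counts(times_min, window=30, minutes_per_day=1440):
    """Per-minute event counts smoothed with a centered moving average."""
    minutes = np.floor(np.asarray(times_min)).astype(int) % minutes_per_day
    counts = np.bincount(minutes, minlength=minutes_per_day).astype(float)
    kernel = np.ones(window) / window            # uniform weights over the window
    return np.convolve(counts, kernel, mode="same")

if __name__ == "__main__":
    # Hypothetical arrivals clustered around 16:30 (minute 990)
    arrivals = [985, 988, 990, 990, 992, 995]
    smoothed = moving_average_counts(arrivals)
    print(smoothed[990])   # average arrivals per minute around 16:30
```

Under this convention, comparing the arrival curve at time $t - 60$ min with the departure curve at time $t$ mirrors the 45-75 min connection window described above.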


B. Normal distribution model 


Next, to understand the effect of the facilitation of passengers’ transit on burstiness in
the air transportation system, we propose a simple model called the Normal distribution model,
which shows bursty behavior. The Normal distribution model is constructed as follows. As with
the Sine model, we assume that the model follows an inhomogeneous Poisson process whose arrival
rate is given by a function of time, $f(t)$. First, we describe the scheduled arrival rate
$f_{\mathrm{schedule}}(t)$, and then we describe the actual arrival rate $f(t)$, which accounts
for both the schedule and the delay.


The scheduled arrival rate has local peaks at the arrival-concentrated times, which
describes the transit facilitation strategy mentioned above. Let $\mu_i$ denote the $i$th
arrival-concentrated time. We assume that these peaks of the scheduled arrival rate can be
modeled by the delta function $\delta(t - \mu_i)$. The scheduled arrival rate is 0 except at
the arrival-concentrated times. This is the extreme case of the concentration of airplanes’
arrivals. This assumption is valid when the peakedness of these peaks is high enough. The
scheduled arrival rate is given by $f_{\mathrm{schedule}}(t) = \sum_i c_i \delta(t - \mu_i)$
$(i = 1, \ldots, n)$, where $c_i$ are constants.


Let us discuss the actual arrival rate $f(t)$ of the Normal distribution model. We consider
the effect of randomness in the modeling. Although pilots aim to reach the destination at
the scheduled time, the actual arrival time deviates randomly because of the weather,
other airplanes, and so forth. Negative delay times indicate early arrivals. The arrival delay
time distribution can be modeled by a normal distribution with a mean of -2.73 min and
a standard deviation of 13.75 min according to analysis of the empirical data. Thus,
we assume that the delay time distribution is given by the normal distribution
$\mathcal{N}(\mu_{\mathrm{delay}}, \sigma^2)$, where $\mu_{\mathrm{delay}} = -2.73$ min and
$\sigma = 13.75$ min. The actual arrival time is spread out following this normal
distribution. Considering the scheduled arrival rate and the delay distribution, the actual
arrival rate is given by the mixture of normal distributions

$$f(t) = \sum_i \frac{c_i}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(t - \tilde{\mu}_i)^2}{2\sigma^2}\right], \qquad (3)$$

where $\tilde{\mu}_i = \mu_i + \mu_{\mathrm{delay}}$.
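A mixture of normal densities of this form can be evaluated directly. The sketch below is a minimal illustration that takes the peak times and weights as given; the function name `actual_arrival_rate`, the use of minutes as the time unit, and the example peak times are assumptions, while the default delay parameters are the values quoted in the text.

```python
import numpy as np

def actual_arrival_rate(t, peak_times, weights, mu_delay=-2.73, sigma=13.75):
    """Weighted sum of normal densities centered at peak_times + mu_delay (minutes)."""
    t = np.atleast_1d(np.asarray(t, dtype=float))[:, None]
    centers = np.asarray(peak_times, dtype=float)[None, :] + mu_delay
    w = np.asarray(weights, dtype=float)[None, :]
    density = np.exp(-0.5 * ((t - centers) / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)
    return (w * density).sum(axis=1)

if __name__ == "__main__":
    # Hypothetical schedule: two arrival-concentrated times at 08:00 and 09:00
    t = np.arange(0.0, 1440.0, 0.5)
    rate = actual_arrival_rate(t, [480.0, 540.0], [0.5, 0.5])
    print(rate.sum() * 0.5)   # numerical integral of the rate over the day
```

Because the weights sum to one in this example, the rate integrates to approximately one, which matches the normalization used when the model is fitted.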

Let us describe the process of fitting the Normal distribution model to the empirical
data. We fit the delta functions $f_{\mathrm{schedule}}(t)$ to the empirical data on scheduled
arrivals and then calculate the actual arrival rate $f(t)$. First, we divide the time space
of the empirical data into subregions and set a peak in each subregion (see Appendix C for
details). In Fig. 3(a), the region segmentation and peak setting results for the case of ATL
in January 2014 are shown. The histogram of scheduled arrivals and the 5-min moving average of
arrivals are shown in blue (gray in gray-scale) and black, respectively. The time space is
divided into subregions by vertical solid lines. The peak in each subregion is represented
by a vertical dashed line. Then, the number of scheduled arrivals in each subregion, $N_i$, is
counted. In fitting, we assume that all scheduled arrivals are concentrated on the peak of each
subregion. Thus, the height of each peak, $c_i$, is proportional to $N_i$. The actual arrival
rate is obtained by substituting the fitted values of $c_i$ into Eq. (3) and applying the
normalization condition $\int f(t)\,dt = 1$. The fitting result of the arrival rate of the
Normal distribution model using the empirical data in the ATL case is shown in Fig. 3(b). The
time is rescaled to $0 < t < 1$.
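The segmentation and peak-setting steps can be sketched as follows. Appendix C is not reproduced in this section, so the specific rules used here (boundaries at strict local minima of a centered 5-min moving average, one peak at the maximum of each subregion, empty subregions skipped) and the function name `fit_schedule_peaks` are assumptions, not the paper's exact procedure.

```python
import numpy as np

def fit_schedule_peaks(counts, window=5):
    """Split per-minute scheduled counts into subregions and place one peak in each.

    Returns the peak positions (minute indices) and weights c_i proportional to
    the number of scheduled arrivals N_i in each subregion, normalized to sum to 1.
    """
    counts = np.asarray(counts, dtype=float)
    smooth = np.convolve(counts, np.ones(window) / window, mode="same")
    i = np.arange(1, len(smooth) - 1)
    # A boundary is placed where the smoothed curve stops decreasing
    is_min = (smooth[i] < smooth[i - 1]) & (smooth[i] <= smooth[i + 1])
    bounds = np.concatenate(([0], i[is_min], [len(smooth)]))
    peaks, weights = [], []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        n_i = counts[lo:hi].sum()                 # scheduled arrivals in the subregion
        if n_i == 0:
            continue                              # skip subregions without arrivals
        peaks.append(lo + int(np.argmax(smooth[lo:hi])))
        weights.append(n_i)
    weights = np.array(weights) / np.sum(weights)
    return np.array(peaks), weights

if __name__ == "__main__":
    # Hypothetical schedule with two clusters of arrivals
    counts = np.zeros(200)
    counts[48:53] = [1, 2, 4, 2, 1]
    counts[148:153] = [2, 4, 8, 4, 2]
    print(fit_schedule_peaks(counts))
```

The returned weights play the role of the $c_i$ and the peak positions the role of the $\mu_i$ once the time axis is rescaled.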


C. Results of model analysis 

Let us discuss the simulation result for the Normal distribution model. In Fig. 4, the
CCDF of the IET of the Normal distribution model constructed using the empirical data in
the case of ATL is shown. The IET distribution of this model is compared with those of the
empirical data of arrivals in ATL and the Sine models for the parameters $a = 0.0$ and
$a = 1.0$. The IET distribution of the Normal distribution model follows a power law with an
exponent $\alpha = 2.5$ and an exponential cutoff. In addition, the IET distribution agrees
well with the empirical data. The cutoff value of the power law and the burst strength
parameter of the model are $\tau_0 = 28.07$ and $a = 0.97$, respectively, which are similar to
those of the empirical data, $\tau_0 = 19.11$ and $a = 0.95$.
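A simulation of this kind can be sketched with the thinning method for inhomogeneous Poisson processes. The choice of thinning, the helper names, and the example rate are assumptions made here; the paper does not state how its simulation was implemented.

```python
import numpy as np

def simulate_arrivals(rate, rate_max, rng, t_end=1.0):
    """Sample event times on [0, t_end) from an inhomogeneous Poisson process.

    Thinning: draw candidates at the constant rate rate_max and keep each one
    with probability rate(t) / rate_max (requires rate(t) <= rate_max).
    """
    n_candidates = rng.poisson(rate_max * t_end)
    candidates = np.sort(rng.uniform(0.0, t_end, n_candidates))
    keep = rng.uniform(0.0, 1.0, n_candidates) < rate(candidates) / rate_max
    return candidates[keep]

def normalized_iets(event_times):
    """Inter-event times divided by their mean."""
    iets = np.diff(event_times)
    return iets / iets.mean()

def ccdf_at(iets, tau):
    """Fraction of inter-event times larger than tau."""
    return float(np.mean(iets > tau))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_events = 20000   # hypothetical average number of events
    rate = lambda t: n_events * (0.9 * np.sin(2.0 * np.pi * 4 * t) + 1.0)
    events = simulate_arrivals(rate, 1.9 * n_events, rng)
    print(len(events), ccdf_at(normalized_iets(events), 5.0))
```

Any non-negative rate function bounded by `rate_max` can be passed in, so the same routine applies to a fitted mixture of normal distributions.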

We theoretically discuss the CCDF of the IET of this model. There is a local minimum,
$t = t_i$, between two consecutive peaks of normal distributions in Fig. 3(b). The event rate,
$f(t)$, can be expanded as $f(t) = c_0 + O\left((t - t_i)^2\right)$ asymptotically at this
point given that $\mu_{i+1} - \mu_i$ is sufficiently small compared with $\sigma$. If the event
rate is a quadratic function, the CCDF of the IET follows a power-law tail with an exponent
$\alpha = 2.5$ [31]. Thus, the Normal distribution model can generate this IET distribution.
This result supports the assumption underlying the Normal distribution model, namely that
transit facilitation shapes airplanes’ arrival behavior, as the fundamental mechanism of the
bursty behavior in the air transportation system.

FIG. 4. (Color online) Comparison among the CCDFs of the IET of the Normal distribution
model constructed with empirical data of the arrival behavior in ATL, Sine model ($a = 1.0$),
(time-independent) Poisson process, and empirical data in the ATL case. The IET distribution of
the Normal distribution model is in agreement with the empirical data and follows a power-law
distribution with an exponent $\alpha = 2.5$ and a cutoff.
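The link between a quadratic minimum of the event rate and the exponent 2.5 can be sketched as follows. This is an outline under an assumed large-$N$ approximation (an event at time $t$ is followed by an exponentially distributed gap with rate $f(t)$), written for the normalized rate $g(t) = f(t)/N$ with $\int g(t)\,dt = 1$ and for the PDF $p(\tau)$ through which $\alpha$ was defined in the Introduction. The curvature coefficient $c_2$ is a hypothetical symbol introduced here, and the outline is not a reproduction of the derivation in the cited reference.

```latex
% PDF of the normalized IET in the large-N limit
p(\tau) = \int g(t)^2 \, e^{-g(t)\tau} \, dt .

% Near a local minimum t_i, write g(t) \simeq c_0 + c_2 (t - t_i)^2 and u = t - t_i:
p(\tau) \simeq e^{-c_0 \tau} \int_{-\infty}^{\infty} (c_0 + c_2 u^2)^2 \, e^{-c_2 \tau u^2} \, du
        = \sqrt{\pi} \, e^{-c_0 \tau} \left[ c_0^2 (c_2 \tau)^{-1/2}
          + c_0 c_2 (c_2 \tau)^{-3/2}
          + \tfrac{3}{4} c_2^2 (c_2 \tau)^{-5/2} \right] .
```

For $\tau \ll 1/c_0$ the last term dominates, giving $p(\tau) \propto \tau^{-5/2}$, while the factor $e^{-c_0 \tau}$ supplies an exponential cutoff at $\tau_0 = 1/c_0$. For the Sine model the minimum of $g$ is $c_0 = 1 - a$, which is consistent with $\tau_0 = 1/(1 - a)$.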

The universality of the bursty behavior originates from the robustness of this mechanism
against variations in the scheduled arrival time distribution; the reason for this robustness
is discussed below. The peakedness of the peaks of the arrival rate distribution is not as
high as that of a delta function because very high traffic concentrations should be avoided
owing to limited traffic capacity. In addition, the arrival delay time distribution is
asymmetric [46]. These two factors affect the scheduled and actual arrival time distributions.
However, the exponent of the power law in the IET distribution of the model remains 2.5 as
long as the delay time distribution is given by a smooth function, since the event rate is
expanded as a quadratic function at the local minimum points in this case. In addition, the
time intervals between two consecutive peaks in the event rate do not greatly affect
burstiness for the same reason.

Carrier  $N$  $E$   $S$    $\langle q \rangle$  $\langle s \rangle$  $\langle l \rangle$  $\langle C \rangle$  $r$    $G(q)$  $G(s)$  Airport  $a$   $\tau_0$
AA       84   354   43711  8.43                 1041                 2.01                 0.53                 -0.64  0.62    0.70    DFW      0.92  12.63
DL       134  650   55928  9.70                 835                  2.11                 0.44                 -0.58  0.66    0.76    ATL      0.87  7.91
US       81   315   33651  7.78                 831                  2.15                 0.50                 -0.82  0.60    0.73    CLT      0.86  7.05
AS       54   207   12169  7.67                 451                  2.27                 0.30                 -0.48  0.53    0.64    SEA      0.76  4.15
UA       81   497   37291  12.27                921                  2.15                 0.58                 -0.72  0.62    0.76    ORD      0.70  3.29
WN       89   1078  86698  24.22                1948                 2.00                 0.66                 -0.48  0.51    0.59    DEN      0.57  2.31
B6       55   276   17966  10.04                653                  2.11                 0.54                 -0.56  0.54    0.63    JFK      0.40  1.67

TABLE II. The properties of airlines’ networks, the burst strength parameter, $a$, and the
cutoff value, $\tau_0$, of the power-law distribution in these airlines’ main hub airports.
$N$, $E$, and $S$ denote the number of nodes, edges without multiple edges, and edges with
multiple edges, respectively. $\langle q \rangle$ and $\langle s \rangle$ denote the average
node degree and strength, respectively. $\langle l \rangle$, $\langle C \rangle$, and $r$
denote the average shortest path length, average clustering coefficient, and degree
assortativity, respectively. $G(q)$ and $G(s)$ denote the degree and strength Gini
coefficients, respectively. FSCs and LCCs are characterized by large and small cutoff values,
respectively.


IV. RELATIONSHIP BETWEEN AIRLINE NETWORKS AND THEIR BURSTY 
BEHAVIORS 


The air transportation network is made up of sublayers, and these sublayers are defined
as each airline’s network. The bursty behavior in an airport stems from each airline
network’s burstiness. We discuss the relationship between the airline networks and the
extent of burstiness. We study the networks of seven major airlines in the U.S. and the
extent of burstiness of the airplanes’ arrival behavior in each airline’s main hub airport
(see Appendix A for a data description). First, we discuss the bursty behavior of arrivals
operated by each airline in its hub airport. Figure 5 shows the CCDFs of the IETs of the
arrivals of these airlines’ airplanes at their main hub airports. The dashed lines on the
left and right sides are the CCDFs of the IETs of Sine models for the parameters $a = 0.0$
and $a = 1.0$, respectively. The figure indicates that each IET distribution is heavy tailed
in comparison with an exponential distribution and that some IET distributions follow a
power-law distribution with an exponential cutoff. The IET distributions differ in the cutoff
values of their power-law tails.

FIG. 5. (Color online) The CCDFs of the IETs of major airlines’ airplanes in their main hub
airports. Two dashed lines are the CCDFs of the IET of the Sine models for the parameters
$a = 0.0$ (left) and $a = 1.0$ (right). The legends in the figure indicate airlines’ IATA codes
and their main hub airports. The IET distributions of these airlines’ networks are heavy tailed
compared with an exponential distribution, while the extent of burstiness varies among these
airlines. The IET distributions of most FSCs are characterized by power-law distributions with
an exponent $\alpha = 2.5$ and an exponential cutoff. In contrast, the IET distributions of
LCCs are similar to an exponential distribution.

In Table II, the basic properties of each airline network, the burst strength parameter, $a$,
and the cutoff value, $\tau_0$, of the power-law distribution of the arrival behavior in its
main hub airport are shown. The table shows that the five highest- and two lowest-ranking
carriers in terms of the extent of burstiness are full-service carriers (FSCs) and low-cost
carriers (LCCs), respectively. Only FSCs are characterized by the power-law tails of the IET
distributions. This indicates that the types of carriers affect the extent of burstiness.


We investigate the reason for the difference in the extent of burstiness. The network
structure of the air transportation system and the extent of the burstiness of its arrival
behavior are strongly associated. The networks of LCCs are characterized by point-to-point
networks, which are similar to the complete graph. In this type of network, the necessity
of transit is low, since most nodes are connected with each other. By contrast, the networks
of the FSCs follow a hub-and-spoke structure, which gives rise to the necessity of the
transit-facilitation strategy. Figure 6 shows the relationships between the degree Gini
coefficient, $G(q)$; the strength Gini coefficient, $G(s)$; and the burst strength parameter,
$a$. The Gini coefficient can quantify the extent of the hub-and-spoke structure and capture
the characteristics of the FSCs’ and LCCs’ networks. The figure shows that the Gini
coefficients and the burst strength parameter are positively related, indicating that the
larger the extent of the hub-and-spoke structure, the more bursty the arrival behavior. This
figure supports the above explanation of why only the arrival behaviors of FSCs’ airplanes are
bursty.
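To illustrate why a Gini coefficient separates hub-and-spoke from point-to-point structures, the sketch below computes it for the degree sequences of a star graph and a complete graph. The formula is the standard mean-absolute-difference definition, which is assumed here because the paper's exact definition is not given in this section; the function name `gini` and the toy graphs are illustrative.

```python
import numpy as np

def gini(values):
    """Gini coefficient: sum_i sum_j |x_i - x_j| / (2 * n * sum(x))."""
    x = np.sort(np.asarray(values, dtype=float))
    n = len(x)
    index = np.arange(1, n + 1)
    # Closed form of the double sum for sorted values
    return float(np.sum((2 * index - n - 1) * x) / (n * np.sum(x)))

if __name__ == "__main__":
    star_degrees = [4, 1, 1, 1, 1]       # one hub connected to four spokes
    complete_degrees = [4, 4, 4, 4, 4]   # every airport connected to every other
    print(gini(star_degrees), gini(complete_degrees))
```

The star graph, an idealized hub-and-spoke network, yields a positive coefficient, whereas the complete graph, an idealized point-to-point network, yields zero.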


V. DISCUSSION 

In this paper, the burstiness of airplanes’ arrival behavior in the air transportation system
was analyzed. First, the empirical data of airplanes’ arrival behavior in a wide range of hub
airports were studied. It was universally observed that the CCDFs of the IET in these
airports followed power-law distributions with an exponent $\alpha = 2.5$ and an exponential
cutoff. These also agreed well with the theoretically calculated IET distributions in the case
of an inhomogeneous Poisson process whose event rate was given by
$f(t) = N[a \sin(2\pi n t) + 1]$ $(0 < t < 1,\ N \to \infty)$ (which is called the Sine model),
regardless of differences in the locations and main carriers of the airports. The extent of
the burstiness, quantified using the cutoff value of the power-law distribution and the burst
strength parameter, was large in most airports.

Moreover, the origin of the universally observed bursty behavior was investigated. Because
of the network structure of the air transportation system (the so-called hub-and-spoke
network with small-world and scale-free characteristics), the system has a strong demand
to facilitate transit at hub airports. Passengers can easily transfer to connecting flights
when airplanes arrive at the airports at almost the same time. This causes the number of
airplanes’ arrivals to fluctuate and the IET distribution to follow a power-law distribution
with an exponent $\alpha = 2.5$ and an exponential cutoff. We verified this analysis by
proposing the Normal distribution model based on the mechanism mentioned above. This model is
defined as an inhomogeneous Poisson process whose event rate is given by a mixture of normal
distributions. Simulation and theoretical analysis of the model indicate that it can describe
the bursty behavior observed in the empirical data. The mechanism is robust against the
frequency of oscillation, the peakedness of the peaks, and the on-time performance of flights.
This robustness contributes to the universality of bursty behavior in the air transportation
system.

Furthermore, we studied the relationship between each airline network and the bursty 
behavior of its arrivals at its main hub airport. The analysis indicated that the extent 
of the hub-and-spoke structure of airline networks and that of the burstiness of airplanes’ 
arrivals were positively correlated. This result substantiated the mechanism for the bursty 
behavior described above. The hub-and-spoke airline networks of FSCs were characterized 
by transport of passengers from one peripheral airport to another via hub airports. Transit 
facilitation was necessary in these networks. In contrast, the extent of the hub-and-spoke 
structure of LCCs’ networks was small, since most airports were connected by direct flights. 
Thus, transit facilitation was not necessary in these networks. The fact that the cutoff 
value was large in the case of airlines with hub-and-spoke networks indicated that transit 
facilitation played a key role in the burstiness of airplanes’ arrival behavior. 

In conclusion, universal bursty behavior was observed in the air transportation 
system, a human-made social network. Analyses of models and empirical data suggested 
that transit facilitation was the mechanism behind this behavior and that this mechanism 
was robust against variations among airports. The fact that many airports followed the same 
law was natural, since the system was optimized to maximize passengers’ convenience and 
carriers’ profit. In addition, the analysis of the necessity of transit facilitation by studying 
airline networks indicated that the bursty behavior originated from the hub-and-spoke 
network structure. One study has suggested that the heavy-tailed degree distribution of 
Wikipedia stems from bursty human activity. These two results indicate that network 
characteristics, including the small-world and scale-free properties, and activity behaviors 
such as burstiness are mutually correlated. The characteristics of one cannot be fully 
understood without considering the other. Analysis of the relationship between these two 
is extremely important. 

FIG. 6. The relationship between the degree Gini coefficient, G(q), (a) and the strength Gini 
coefficient, G(s), (b) and the burst strength parameter, a. LR represents the linear regression line. 
Both G(q) and G(s) are positively correlated with a. 

APPENDIX A: DATASET AND DATA PROCESSING 


The dataset used in Sec. II is the Airline On-Time Performance Data from the RITA 
database of the Bureau of Transportation Statistics (BTS). Data from the 10 largest 
airports in the U.S. based on total passenger boarding in 2014 were used in Sec. II. The 10 
airports were Hartsfield-Jackson Atlanta International Airport (ATL), Los Angeles 
International Airport (LAX), O’Hare International Airport (ORD), Dallas/Fort Worth 
International Airport (DFW), John F. Kennedy International Airport (JFK), Denver International 
Airport (DEN), San Francisco International Airport (SFO), Charlotte Douglas International 
Airport (CLT), McCarran International Airport (LAS), and Phoenix Sky Harbor 
International Airport (PHX). The IATA codes are written in parentheses. In addition, data from 
Seattle-Tacoma International Airport (SEA) were used in Sec. IV. Moreover, the air 
transportation networks of the seven main airlines were analyzed in Sec. IV. These airlines were 
American Airlines (AA), Delta Air Lines (DL), US Airways (US), Alaska Airlines (AS), 
United Airlines (UA), Southwest Airlines (WN), and JetBlue Airways (B6). The dataset 
contains on-time performance data such as scheduled and actual arrival times, destinations, 
and carriers for non-stop domestic flights. The data were reported by carriers with at least 
1% of the total domestic scheduled passenger revenue. The data from all flights operated in 
January 2014 were used in the analysis. Arrival and departure times were recorded at a 
resolution of 1 min. The analysis used the actual arrival times, and canceled flights were 
removed from the IET distributions. The IETs spanning two business days were removed 
to exclude the influence of off-hours at night (see Appendix E for a discussion of 
off-hours at night). The airport data in Table I were also collected by BTS. The 
main carriers and their shares of airports were based on enplaned passengers (both arriving 
and departing). 
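
As an illustration of the processing steps above, the following Python sketch computes IETs within each business day after dropping canceled flights. It is a minimal sketch and not the authors' code: the record layout (day index, actual arrival time in minutes, canceled flag) and the names `within_day_iets` and `rescale` are assumptions introduced here, and the real BTS files use a different format.

```python
from collections import defaultdict


def within_day_iets(records):
    """Return inter-event times (IETs) computed inside each business day.

    `records` is an iterable of (day, arrival_minute, canceled) tuples.
    This layout is hypothetical; it is not the BTS file format.
    """
    by_day = defaultdict(list)
    for day, minute, canceled in records:
        if not canceled:  # canceled flights are removed
            by_day[day].append(minute)

    iets = []
    for day in sorted(by_day):
        times = sorted(by_day[day])
        # Differences are taken within one day only, so no IET spans
        # the off-hours between two business days.
        iets.extend(later - earlier for earlier, later in zip(times, times[1:]))
    return iets


def rescale(iets):
    """Divide each IET by the mean IET (assumes a non-zero mean)."""
    mean = sum(iets) / len(iets)
    return [value / mean for value in iets]
```

For example, three arrivals at minutes 10, 12, and 18 on one day and two arrivals at minutes 5 and 6 on the next day give the raw IETs 2, 6, and 1. No interval is computed between minute 18 of the first day and minute 5 of the second.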


APPENDIX B: VARIATION IN THE CUTOFF VALUE IN EACH AIRPORT 

We discuss two reasons why airports vary in the cutoff values listed in Table I. First, the 
fact that the LCCs’ networks are characterized by small cutoff values, mentioned in Sec. IV, 
explains the aggregated bursty behaviors in hub airports. In Table I, WN and B6 are LCCs. 
An airport has a relatively small cutoff value if an LCC is its dominant carrier. Second, the 
main airline’s share in each airport explains the extent of burstiness in airports. In Table I, 
ATL, DFW, and CLT have quite large cutoff values. The main carriers in these airports have 
a high share. Each carrier concentrates airplanes’ arrivals to facilitate transit; however, these 
carriers seldom cooperate unless they participate in the same alliance group. The amount of 
aggregated arrivals is a summation of each carrier’s arrivals. If the majority of airplanes in 
an airport are operated by one carrier, the effect of concentration of arrivals because of the 
transit facilitation is strong, leading to a quite large cutoff value. By contrast, each carrier’s 
share is not large in other airports. This weakens the effect of transit facilitation, leading to 
a relatively small cutoff value. 

In addition, JFK has a small burst strength parameter. This is because the available 
dataset is limited to only domestic flights. In JFK, the share of international flights in 
all scheduled flights is large. As a result, the bursty behavior in JFK is not appropriately 
assessed, which results in a small cutoff value. 

FIG. 7. (Color online) The CCDFs of the inter-departure time in three hub airports. The maximum 
and minimum of the probability are plotted for each IET $\tau$. 


APPENDIX C: DETAILS OF THE FITTING PROCESS OF THE MODEL TO 
THE EMPIRICAL DATA OF THE NORMAL DISTRIBUTION MODEL 

We discuss the fitting process in constructing the normal distribution model in Sec. III. We 
divide the time space of the empirical data into subregions and set a peak in each subregion 
as follows. The number of scheduled arrivals at a certain time of a day is counted. The 
data from different days are aggregated in counting to obtain a sufficient amount of data. We 
define a time $t$ as a local minimum (maximum) time, $t_{\mathrm{lmin}}$ ($t_{\mathrm{lmax}}$), when the average of the number of 
scheduled arrivals at that time is minimal (maximal) in the range $t - t_{\mathrm{range}} \le t' < t + t_{\mathrm{range}}$. 
Let the $t_{\mathrm{lmin}}$s and $t_{\mathrm{lmax}}$s denote the local minimum and maximum times, respectively. The 
first subregion is defined as the region from 0 to the minimum of the $t_{\mathrm{lmin}}$s, and 
the $i$th subregion is defined by the region from $t_{\max,i-1}$ to the minimum of the $t_{\mathrm{lmin}}$s such that 
$\exists\, t_{\mathrm{lmax}},\ t_{\max,i-1} < t_{\mathrm{lmax}} < t_{\mathrm{lmin}}$. Then, the $i$th peak $t_{\mathrm{peak},i}$ is defined by the minimum of the $t_{\mathrm{lmax}}$s 
in the $i$th subregion. The parameters of the result in Fig. 3 are $t_{\mathrm{av}} = 5$ min and $t_{\mathrm{range}} = 10$ 
min. In this case, the time space has 16 subregions. 
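
The subdivision procedure above can be sketched in Python as follows. This is one possible reading of the procedure, not the authors' implementation. The function names are hypothetical, the input is assumed to be the already averaged arrival counts per time step, ties on plateaus are not treated specially, and the rule for the $i$th subregion is also applied to the first one for simplicity.

```python
def local_extrema(values, t_range):
    """Return indices that are minimal (maximal) in [t - t_range, t + t_range)."""
    n = len(values)
    lmins, lmaxs = [], []
    for t in range(n):
        window = values[max(0, t - t_range):min(n, t + t_range)]
        if values[t] == min(window):
            lmins.append(t)
        if values[t] == max(window):
            lmaxs.append(t)
    return lmins, lmaxs


def subregions_and_peaks(values, t_range):
    """Split the time axis into subregions, each holding one peak."""
    lmins, lmaxs = local_extrema(values, t_range)
    regions, peaks = [], []
    start = 0
    while True:
        # The peak is the earliest local maximum in the current subregion.
        peak = next((m for m in lmaxs if m >= start), None)
        if peak is None:
            break
        # The subregion ends at the earliest local minimum after that peak,
        # or at the end of the data if no such minimum exists.
        end = next((m for m in lmins if m > peak), len(values))
        regions.append((start, end))
        peaks.append(peak)
        if end >= len(values):
            break
        start = end
    return regions, peaks
```

With the toy series `[1, 3, 5, 3, 1, 2, 6, 2, 0, 4, 7, 4]` and `t_range = 2`, this sketch returns the subregions `(0, 4)`, `(4, 8)`, and `(8, 11)` with peaks at indices 2, 6, and 10.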
FIG. 8. (Color online) The theoretically calculated CCDFs of the IET of the Sine models for 
the parameters $a = 0.0$, $0.5$, $0.9$, and $1.0$. The event rate of the Sine model is given by 
$f(t) = N\{a \sin(2n\pi t) + 1\}$ $(0 < t < 1,\ N \to \infty)$. When $a = 1$, the IET distribution follows a power law 
with an exponent $\alpha = 2.5$ without any cutoffs. Otherwise, the IET distribution follows a power 
law with an exponential cutoff. When $a = 0$, the IET distribution is identical to the exponential 
distribution. 


APPENDIX D: BURSTINESS IN AIRPLANES’ DEPARTURES 

The bursty behavior of airplanes’ arrivals is studied in Sec. II. In this Appendix, we study 
that of airplanes’ departures. The CCDFs of the inter-departure time of airplanes in three 
hub airports are shown in Fig. 7. The distributions follow power laws with exponential 
cutoffs, just as inter-arrival time distributions do. This result indicates that airplanes’ 
departure behavior also obeys the same mechanism generating burstiness. 

However, compared with inter-arrival time distributions, the slopes of the inter-departure 
time distributions on a log-log plot are slightly steeper. This indicates that the event rate 
expands as $f(t - t_i) = c_i + O((t - t_i)^n)$ asymptotically, where $n < 2$, at local minimum points $t_i$ 
if we assume that the airplanes’ departures follow an inhomogeneous Poisson process. 
This result suggests that the delay distribution is not a smooth function because of artificial 
controls. It is difficult to artificially control the arrival times of airplanes, which results 
in a smooth event rate. By contrast, since airplanes are on the ground, it is relatively 
undemanding to control the waiting time before airplanes take off, especially when runways 
are not fully utilized. Since runways are likely to be vacant near the local minimum points, 
the effect of this artificial control makes the delay time distribution unsmooth when the 
event rate is low and the resulting $\tau$ is large. 
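
The link between the order $n$ of the event rate at a local minimum and the slope of the tail can be checked numerically. The sketch below is only a toy illustration under strong assumptions that are not taken from the paper: a single minimum with $c_i = 0$, a normalized rate $g(u) = (n+1)u^n$ on $0 < u < 1$, and events weighted by the rate, so that the CCDF of the rescaled IET is $\int_0^1 g(u)\,e^{-g(u)x}\,du$. The function names are hypothetical.

```python
import numpy as np


def ccdf_power_rate(x, n, m=400_000):
    """CCDF of the rescaled IET for the toy rate g(u) = (n + 1) * u**n."""
    u = (np.arange(m) + 0.5) / m  # midpoint rule on (0, 1)
    g = (n + 1.0) * u**n          # normalized so that the mean rate is 1
    return float(np.mean(g * np.exp(-g * x)))


def loglog_slope(n, x1=100.0, x2=400.0):
    """Slope of the CCDF on a log-log plot between x1 and x2."""
    y1 = ccdf_power_rate(x1, n)
    y2 = ccdf_power_rate(x2, n)
    return float((np.log(y2) - np.log(y1)) / (np.log(x2) - np.log(x1)))
```

Under these assumptions the CCDF slope comes out close to $-(1 + 1/n)$, that is, about $-1.5$ for a smooth minimum with $n = 2$ and steeper values for $n < 2$, which is in line with the steeper inter-departure time distributions described above.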


APPENDIX E: EXISTENCE OF INACTIVE REGIONS IN EMPIRICAL DATA 

We discuss a model that follows an inhomogeneous Poisson process. If the event rate, 
$f(t)$, is positive, the IET converges to 0 in the limit $N \to \infty$, where $N$ is the average number 
of total events in a trial. However, if there is a region $t_{0,\min} < t < t_{0,\max}$ where $f(t) = 0$, 
the time interval between the last event before $t = t_{0,\min}$ and the first event after $t = t_{0,\max}$ 
is finite, even in the limit $N \to \infty$. Thus, the rescaled IET $\tau/\langle\tau\rangle$ diverges to infinity in this 
limit. However, since the number of such regions is also finite, the percentage of the infinite 
rescaled IETs converges to 0 in this limit. Thus, it is not necessary to consider these regions 
if we take the limit. However, the number of events is finite in reality and these regions 
affect the result. We call the regions with $f(t) \neq 0$ and $f(t) = 0$ active and 
inactive regions, respectively. Most social behaviors have inactive regions, such as late at 
night when most people sleep. In the case of the air transportation system, no airplanes fly 
in most airports at night. 
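
The effect of an inactive region can be illustrated with a small simulation. The following Python sketch is a toy example under assumptions made here, not the model used in the paper: the event rate is constant on $0 < t < 1$ except for a single inactive region, whose default position (0.4, 0.6) is arbitrary, and the function name is hypothetical.

```python
import numpy as np


def rescaled_iets_with_gap(n_events, gap=(0.4, 0.6), seed=0):
    """Sample a Poisson process on (0, 1) whose rate is zero inside `gap`.

    Returns the IETs divided by their mean. `n_events` is the average
    number of events in a trial.
    """
    rng = np.random.default_rng(seed)
    count = rng.poisson(n_events)
    lo, hi = gap
    active_length = 1.0 - (hi - lo)
    # Draw uniform points on the active length, then shift the points
    # that fall after the inactive region past it.
    s = np.sort(rng.uniform(0.0, active_length, size=count))
    t = np.where(s < lo, s, s + (hi - lo))
    iets = np.diff(t)
    return iets / iets.mean()
```

In this sketch the single IET that spans the inactive region grows in proportion to the number of events once it is rescaled, while it remains only one interval among all of them. This mirrors the argument above that such intervals diverge but become a vanishing fraction in the limit $N \to \infty$.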


APPENDIX F: THEORETICAL ANALYSIS OF THE SINE MODEL 


We theoretically derive the CCDF of the IET of the Sine model in this Appendix. The 
event rate of the Sine model is $f(t) = N\{a \sin(2n\pi t) + 1\}$ $(0 < t < 1)$. In the limit $N \to \infty$, 
the event can be considered to occur every time. The rescaled IET at time $t$ is given by 
$\tau/\langle\tau\rangle = X/f(t)$, where $X = N\Delta x$ is a stochastic variable whose PDF is $P(X) = e^{-X}$ and 
$1/f(t)$ represents the average event interval. Then, considering the whole time period, the 
distribution of $\tau/\langle\tau\rangle$ is given by the product of the distributions of $X$ and $1/f(t)$. Then, 
we obtain 

$$ \cdots + e^{-(1+a)\tau/\langle\tau\rangle}. \tag{F.1} $$

This result is independent of $n$ since the distribution of $1/f(t)$ is the same regardless of $n$. 
Moreover, when $\tau$ is sufficiently large, we obtain the approximate solution 

$$ P(\tau/\langle\tau\rangle) \simeq \frac{1 + 3a + 8a(1-a)\,\tau/\langle\tau\rangle}{8\sqrt{2\pi}\,a^{3/2}} \left(\frac{\tau}{\langle\tau\rangle}\right)^{-3/2} e^{-(1-a)\tau/\langle\tau\rangle} \tag{F.2} $$

using Taylor expansion of functions from $x = 0$, cutting off high-order terms, changing the upper 
limit of the interval of integration in the first term to $\infty$, and ignoring the second term. Since 
the change in the value of $[1 + 3a + 8a(1-a)\tau/\langle\tau\rangle]/(8\sqrt{2\pi}\,a^{3/2})$ is small compared with 
that of $(\tau/\langle\tau\rangle)^{-3/2}$ when $\tau < 1/(1-a)$, this result indicates that the CCDF of the IET of the Sine model 
follows a power-law function with exponent $\alpha = 5/2$ and a cutoff $\tau_0 = 1/(1-a)$. 

The CCDF of the IET is shown in Fig. 8. A large parameter $a$ reflects a heavy-tailed 
IET distribution. The exponent of the power law is 2.5 and independent of the parameter 
$a$. However, the larger the parameter $a$, the larger the cutoff value of the IET distribution. 
When $a = 0$, the IET distribution is given by an exponential distribution. In the case of 
$a > 0$, the IET distribution is approximately given by a power-law distribution with an 
exponent $\alpha = 2.5$ and an exponential cutoff when $\tau$ is large. When $a = 1$, the cutoff of the 
IET distribution vanishes. The parameters $a = 0.5$, $0.9$, and $1.0$ correspond to the cutoff 
values $\tau_0 = 2.0$, $10.0$, and $\infty$. Any distribution between these extreme cases can be assessed 
using this parameter. 
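
The behavior of the Sine model described in this Appendix can be checked numerically. The Python sketch below is a hedged illustration rather than the authors' calculation. It assumes a normalized rate $g(t) = a\sin(2\pi t) + 1$ (that is, $n = 1$ with the factor $N$ divided out) and events weighted by the rate, so that the CCDF of the rescaled IET is $\int_0^1 g(t)\,e^{-g(t)x}\,dt$. The closed form in `sine_ccdf_asymptotic`, $[1 + 3a + 8a(1-a)x]\,x^{-3/2}e^{-(1-a)x}/(8\sqrt{2\pi}\,a^{3/2})$, is one reading of the large-$\tau$ expression of Eq. (F.2) and should be treated as an assumption. All function names are hypothetical.

```python
import numpy as np


def sine_ccdf(x, a, m=200_000):
    """Numerical CCDF of the rescaled IET for g(t) = a*sin(2*pi*t) + 1."""
    t = np.arange(m) / m  # uniform grid over one full period
    g = a * np.sin(2.0 * np.pi * t) + 1.0
    # Mean over a full period approximates the integral over 0 < t < 1.
    return float(np.mean(g * np.exp(-g * x)))


def sine_ccdf_asymptotic(x, a):
    """Assumed large-x form (one reading of Eq. (F.2)); requires a > 0."""
    prefactor = (1.0 + 3.0 * a + 8.0 * a * (1.0 - a) * x) / (
        8.0 * np.sqrt(2.0 * np.pi) * a**1.5
    )
    return float(prefactor * x**-1.5 * np.exp(-(1.0 - a) * x))


def cutoff(a):
    """Cutoff value 1 / (1 - a); infinite when a = 1."""
    return float("inf") if a >= 1.0 else 1.0 / (1.0 - a)
```

Under these assumptions, the numerical integral reduces to $e^{-x}$ for $a = 0$ and approaches the asymptotic form for large $x$ when $a > 0$. `cutoff` reproduces the cutoff values quoted above for $a = 0.5$, $0.9$, and $1.0$.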